Optimal feedback control of dynamical systems via value-function approximation
Authors
Abstract
A self-learning approach for optimal feedback gains for finite-horizon nonlinear continuous time control systems is proposed and analysed. It relies on parameter dependent approximations to the optimal value function obtained from a family of universal approximators. The cost functional for the training of an approximate optimal feedback law incorporates two main features. First, it contains the average over the objective functional values of the parametrized feedback control for an ensemble of initial values. Second, it is adapted to exploit the relationship between the maximum principle and dynamic programming. Based on universal approximation properties, existence, convergence and first order optimality conditions for optimal neural network feedback controllers are proved.
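The first feature of the training cost, averaging the closed-loop objective over an ensemble of initial values, can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: a scalar linear system dx/dt = a*x + b*u, running cost x^2 + u^2, a one-parameter quadratic ansatz V(x) = theta * x^2 standing in for a neural network, explicit Euler time stepping, and a grid search standing in for gradient-based training. The second feature (the link to the maximum principle) is not modelled.

```python
# Illustrative sketch only: system, ansatz, and optimizer are assumptions.

def closed_loop_cost(theta, x0, a=1.0, b=1.0, T=1.0, steps=200):
    """Finite-horizon cost of the feedback induced by V(x) = theta * x**2."""
    dt = T / steps
    x, cost = x0, 0.0
    for _ in range(steps):
        # Feedback from the value-function ansatz: u = -(1/2) * b * dV/dx.
        u = -b * theta * x
        cost += (x * x + u * u) * dt   # running cost x^2 + u^2
        x += (a * x + b * u) * dt      # explicit Euler step of the dynamics
    return cost


def ensemble_objective(theta, initial_values):
    """Average closed-loop cost over an ensemble of initial values."""
    total = sum(closed_loop_cost(theta, x0) for x0 in initial_values)
    return total / len(initial_values)


def train(initial_values, candidates):
    """Pick the parameter with the lowest ensemble objective (grid search)."""
    return min(candidates, key=lambda th: ensemble_objective(th, initial_values))


initial_values = [-1.0, -0.5, 0.5, 1.0]          # hypothetical ensemble
candidates = [k * 0.05 for k in range(0, 81)]    # theta in [0, 4]
theta_star = train(initial_values, candidates)
print("selected theta:", theta_star)
print("ensemble cost :", ensemble_objective(theta_star, initial_values))
print("uncontrolled  :", ensemble_objective(0.0, initial_values))
```

A single parameter is trained against many initial values at once, so the resulting feedback law is not tuned to one trajectory. A richer parametrization, such as a time-dependent network, would slot into `closed_loop_cost` in place of the quadratic ansatz.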
Related resources
Controller design and value function approximation for nonlinear dynamical systems
This work considers the infinite-time discounted optimal control problem for continuous time input-affine polynomial dynamical systems subject to polynomial state and box input constraints. We propose a sequence of sum-of-squares (SOS) approximations of this problem obtained by first lifting the original problem into the space of measures with continuous densities and then restricting these den...
Batch Value Function Approximation via Support Vectors
We present three ways of combining linear programming with the kernel trick to find value function approximations for reinforcement learning. One formulation is based on SVM regression; the second is based on the Bellman equation; and the third seeks only to ensure that good moves have an advantage over bad moves. All formulations attempt to minimize the number of support vectors while fitting ...
Value function approximation via low-rank models
We propose a novel value function approximation technique for Markov decision processes. We consider the problem of compactly representing the state-action value function using a low-rank and sparse matrix model. The problem is to decompose a matrix that encodes the true value function into low-rank and sparse components, and we achieve this using Robust Principal Component Analysis (PCA). Unde...
Optimal control of entanglement via quantum feedback
It has recently been shown that finding the optimal measurement on the environment for stationary Linear Quadratic Gaussian control problems is a semi-definite program. We apply this technique to the control of the EPR-correlations between two bosonic modes interacting via a parametric Hamiltonian at steady state. The optimal measurement turns out to be nonlocal homodyne measurement — the outpu...
Optimal synthesis via superdifferentials of value function
We derive a differential inclusion governing the evolution of optimal trajectories to the Mayer problem. The value function is allowed to be discontinuous. This inclusion has convex compact right-hand sides.
Journal
Journal title: Comptes rendus
Year: 2023
ISSN: 1873-7234
DOI: https://doi.org/10.5802/crmeca.199